Studies of copy number variation (genomic imbalance) are providing insight into both complex and Mendelian genetic
disorders. Array comparative genomic hybridization (array CGH), a tool for detecting copy number variants at a resolution 
previously unattainable in clinical diagnostics, is increasingly used as a first-line test at clinical genetics laboratories. Many 
copy number variants are of unknown significance; correlation and comparison with other patients will therefore be essen- 
tial for interpretation. We present a resource for clinicians and researchers to identify specific copy number variants and 
associated phenotypes in patients from a single catchment area, tested using array CGH at the SE Thames Regional Genetics 
Centre, London. User-friendly searching is available, with links to external resources, providing a powerful tool for the 
elucidation of gene function. We hope to promote research by facilitating interactions between researchers and patients. 
The BBGRE (Brain and Body Genetic Resource Exchange) resource can be accessed at the following website: http://bbgre.org 

Database URL: http://bbgre.org 



Background and significance 

Clinical genetic laboratories are increasingly using high-resolution
microarray-based genomic analyses, usually array comparative
genomic hybridization (aCGH), as first-line diagnostic tests
to detect copy number variants
(CNVs) in individuals referred for congenital abnormalities 
and developmental disabilities (1). As well as improving 
genomic diagnosis overall, these high-resolution analyses 
have enabled discovery of novel CNVs associated with 
genetic syndromes and complex disorders such as autism, 
epilepsy and intellectual disability (2, 3). Despite the many 
successes so far, there is a need to better exploit the volume 
of clinical and genomic data that is being accumulated in 
diagnostic genetic laboratories (typically thousands of
patients tested per year per centre) and to make these
data available to the wider scientific and clinical genetic 
community. 

A number of resources are already available to the com- 
munity: CHOP CNV holds CNV data derived from a study of 
~2000 healthy individuals (4), and DGV (5) holds an aggre- 
gated set of control data from ~12 000 healthy individuals. 
The DECIPHER (6) and ISCA (7) databases hold aggregated
data from ~5000 and ~28 000 phenotypically abnormal
individuals, respectively.

Objective 

The Brain and Body Genetic Resource Exchange (BBGRE) 
was set up to provide CNV data and clinical phenotypes of
patients referred for genetic testing at a single regional
genetics laboratory. This resource is of particular interest 
to local researchers, as those patients who have given in- 
formed consent are easily accessible; the majority of indi- 
viduals are from a single catchment area and re-contact of 
patients is included on the consent form. The resource may 
also be of particular interest to researchers wishing to
concentrate on this geographically defined population.
Depending on the hypothesis in question, this geographic
restriction may instead be a disadvantage; however,
the data set can also be integrated into meta-analyses
and aggregate studies.

Specifically, this article describes a database created to
house these clinical and genetic data, as well as a web-based
interface for interrogation and interpretation of BBGRE 
data. Furthermore, the website allows registered users to 
submit research project proposals, which are forwarded to 
the steering committee for approval. 

Materials and methods 

CNV detection and interpretation 

We have previously described the implementation of aCGH 
as a test for genome imbalance (8); in short, CNVs are detected
using an Agilent (USA) 60K platform with a median
resolution of 120 kb (AMADID 028469), using a patient versus
patient hybridization strategy. Full details of protocols,
analysis and interpretation are presented in Ahn et al. (8). 
Briefly, samples were co-hybridized with other samples mis- 
matched for phenotype and matched for sex. Analysis was 
performed using Agilent algorithm ADM-2, threshold 6 and 
a 3-probe minimum aberration call; a further analysis using 
ADM-1 was carried out to maximize detection of mosaicism 
(9). Imbalances of regions represented in DGV by at least 
three non-BAC-based studies were classified benign. The 
clinical significance of remaining imbalances was assessed 
by examining the functional content of the region of im- 
balance, referencing known benign and pathogenic CNVs 
(ISCA, DECIPHER, OMIM) and an internal clinical database
of previously tested individuals (Moka). All samples with 
imbalances that were of potential clinical significance 
were re-tested using G-banded karyotyping, QF-PCR, FISH, 
custom MLPA (10) or a repeat array. 
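
The benign-classification rule described above can be sketched in a few lines of Python. This is only an illustration, not the laboratory's actual pipeline: the `Region` and `DgvEntry` types are hypothetical, and the sketch assumes that a region is "represented" in DGV when a DGV variant fully contains the imbalance, which the text does not specify.

```python
from dataclasses import dataclass
from typing import Iterable

MIN_STUDIES = 3  # "at least three non-BAC-based studies" (from the text)


@dataclass(frozen=True)
class Region:
    chrom: str
    start: int  # base pairs, inclusive
    end: int    # base pairs, inclusive


@dataclass(frozen=True)
class DgvEntry:
    """Hypothetical, simplified view of one DGV variant record."""
    region: Region
    study: str       # identifier of the reporting study
    bac_based: bool  # True if the study used BAC arrays


def covers(outer: Region, inner: Region) -> bool:
    # Assumption: full containment is required; partial overlap does not count.
    return (outer.chrom == inner.chrom
            and outer.start <= inner.start
            and outer.end >= inner.end)


def is_benign(imbalance: Region, dgv: Iterable[DgvEntry]) -> bool:
    """Benign if enough distinct non-BAC studies report a covering variant."""
    studies = {e.study for e in dgv
               if not e.bac_based and covers(e.region, imbalance)}
    return len(studies) >= MIN_STUDIES
```

Counting distinct studies rather than distinct variants means that several overlapping calls from one publication contribute only once, which is one plausible reading of the rule.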

Data collection 

The referral process for clinical aCGH at Guy's Hospital re- 
quires clinicians to fill out a genetic testing referral form, 
and patients provide informed consent on a signed BBGRE 
consent form. The consent form allows patients to specify 
whether they agree to be contacted for research studies 
and/or to allow their DNA sample to be used for research. 
The referral form includes a checklist of referral reasons, 
comprising a comprehensive list of neurodevelopmental
disorders, as well as a range of other medical and congenital
conditions (http://bbgre.org/info/accessing-data). The
checklist is the preferred method for indicating the clinical 
phenotype, although free text entry is also allowed. On 
receipt of samples at the laboratory, details from the refer- 
ral and consent form are recorded in a bespoke LIMS 
(Moka) behind the Guy's & St Thomas' NHS Trust firewall. 

Once aCGH testing is complete, the phenotype, consent 
and CNV data are collected for patients that are found to 
carry potentially clinically significant CNVs. A unique BBGRE 
ID is generated for each patient before the data are 
anonymized. A key table linking the BBGRE ID to patients 
is stored within Moka. The anonymized data are routinely 
transferred via sFTP from Moka to the BBGRE server as flat 
files, mirroring the data structure of the BBGRE database 
tables. A simple script on the BBGRE server is scheduled to 
import any data when received. 
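
As a rough illustration of the anonymization step, the Python sketch below assigns a stable identifier to each patient, records the link in a key table that would stay inside the clinical system, and strips identifying fields from the exported records. The field names, the six-digit random identifier and the `anonymise` function are assumptions made for this example; the article does not describe how Moka generates BBGRE IDs.

```python
import secrets
from typing import Dict, Iterable, List, Set

# Hypothetical field names; the real Moka schema is not described in the text.
IDENTIFYING_FIELDS = {"patient_id", "name", "date_of_birth"}


def new_bbgre_id(used: Set[int]) -> int:
    """Draw a random six-digit identifier that is not already in use (assumed format)."""
    while True:
        candidate = 100000 + secrets.randbelow(900000)
        if candidate not in used:
            used.add(candidate)
            return candidate


def anonymise(records: Iterable[dict], key_table: Dict[str, int]) -> List[dict]:
    """Return copies of the records without identifying fields.

    key_table maps internal patient identifiers to BBGRE IDs. It is updated
    in place and is never part of the exported data.
    """
    used = set(key_table.values())
    exported = []
    for record in records:
        patient = record["patient_id"]
        if patient not in key_table:
            key_table[patient] = new_bbgre_id(used)
        clean = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
        clean["bbgre_id"] = key_table[patient]
        exported.append(clean)
    return exported
```

Reusing the key table across exports keeps a patient's identifier stable between releases, so that re-contact through the clinical system remains possible.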

Database design and population 

An overview of the BBGRE infrastructure and data flow is 
shown in Figure 1. The resource comprises a Linux server 
with a MySQL database backend. To populate the BBGRE 
database, each table within Moka is exported and trans- 
ferred as tab-separated flat text files. A Perl script parses 
the flat files to remove trailing spaces, escape special char- 
acters and convert chromosome identifiers into integers 
(X -> 23, Y -> 24). The script then imports the files into the
MySQL database. Each table in the BBGRE database is 
deleted and then repopulated with each new Moka ver- 
sion. Gene annotation is taken from HGNC and NCBI 
RefSeq (currently hg19). Web pages are served to users'
browsers via an Apache web server using Perl/CGI and
jQuery/Ajax.
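
The import step can be sketched as follows. The production script is written in Perl and targets MySQL; this Python version is only an approximation, using the standard-library `sqlite3` module as a stand-in so that it runs without a database server. The table name `cnv` and its four columns are hypothetical. Parameterised inserts are used here in place of manual escaping of special characters.

```python
import csv
import io
import sqlite3

# Hypothetical, minimal schema; the real BBGRE tables are not listed in the text.
SCHEMA = ("CREATE TABLE IF NOT EXISTS cnv "
          "(bbgre_id INTEGER, chrom INTEGER, start_bp INTEGER, end_bp INTEGER)")

SEX_CHROMOSOMES = {"X": 23, "Y": 24}  # mapping given in the text


def chrom_to_int(value: str) -> int:
    """Convert a chromosome label such as '14', 'X' or 'chrY' to an integer."""
    label = value.strip().upper()
    if label.startswith("CHR"):
        label = label[3:]
    if label in SEX_CHROMOSOMES:
        return SEX_CHROMOSOMES[label]
    return int(label)


def clean_rows(handle):
    """Yield typed rows from a tab-separated flat file, dropping trailing spaces."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue
        bbgre_id, chrom, start, end = (field.rstrip() for field in row)
        yield int(bbgre_id), chrom_to_int(chrom), int(start), int(end)


def reload_table(conn: sqlite3.Connection, rows) -> None:
    """Empty the table and repopulate it within a single transaction."""
    with conn:
        conn.execute("DELETE FROM cnv")
        conn.executemany("INSERT INTO cnv VALUES (?, ?, ?, ?)", rows)
```

Wrapping the delete and the inserts in one transaction means a failed import leaves the previous release in place rather than an empty table.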

Data interrogation 

Users access BBGRE via the BBGRE website, http://bbgre.org.
Here, users are able to interrogate the database for
CNV and basic demographic data. The search interface 
allows use of combinations of criteria including genomic 
location, gene content, CNV inheritance patterns and patient
age and gender. A search with CNV-specific criteria
returns a table of individual CNVs, whereas a search with 
demographic criteria returns a table of patients. These 
tables can be sorted via column headers, and iterative 
searching is also available. To aid further analysis, tables 
can be exported as tab-separated files (.tsv) and searches
can be saved as web browser bookmarks. Each table also 
provides links to patient details pages where all the demo- 
graphic and CNV information for individual patients is dis- 
played. To help with interpretation of these data, links are 
provided to view CNVs in the UCSC Genome Browser (11). A
UCSC session is used to provide a predefined set of tracks 
and viewing configuration that best provides a genomic 
context for the CNV in question. A custom track showing
all CNVs contained in BBGRE is also displayed to show
context within BBGRE.

Figure 1. Outline of BBGRE infrastructure.
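
To make the search and export behaviour concrete, the following Python sketch combines optional criteria (genomic location and inheritance) and writes the matching CNVs as tab-separated text. It is a simplified stand-in for the website's Perl/CGI queries: the `Cnv` record, its field names and the example inheritance values are assumptions, and a location match is taken to mean any overlap with the query interval.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Iterable, List, Optional


@dataclass
class Cnv:
    """Hypothetical, simplified CNV record."""
    bbgre_id: int
    chrom: int        # 1-22, 23 = X, 24 = Y
    start: int
    end: int
    change: str       # e.g. "gain" or "loss"
    inheritance: str  # e.g. "de novo", "maternal", "paternal"


def search(cnvs: Iterable[Cnv],
           chrom: Optional[int] = None,
           start: Optional[int] = None,
           end: Optional[int] = None,
           inheritance: Optional[str] = None) -> List[Cnv]:
    """Return CNVs matching every criterion that was supplied."""
    hits = []
    for cnv in cnvs:
        if chrom is not None and cnv.chrom != chrom:
            continue
        if start is not None and cnv.end < start:    # CNV lies left of the query
            continue
        if end is not None and cnv.start > end:      # CNV lies right of the query
            continue
        if inheritance is not None and cnv.inheritance != inheritance:
            continue
        hits.append(cnv)
    return hits


def to_tsv(cnvs: Iterable[Cnv]) -> str:
    """Serialise CNVs as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow([f.name for f in fields(Cnv)])
    for cnv in cnvs:
        writer.writerow(list(asdict(cnv).values()))
    return buffer.getvalue()
```

Because every criterion is optional, the same function supports iterative searching: the result of one call can be passed back in with a further criterion added.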

Users are also able to apply for 'advanced user access', 
which enables searching of clinical phenotype data and also 
allows submission of research project proposals. One key 
aspect of this resource is the potential availability of research 
participants within a relatively small geographical area, 
which, for example, would simplify the logistics associated 
with further deep phenotyping of cases. As the steering com- 
mittee acts as the gatekeeper between researchers and pa- 
tients, this registration is necessary. All applications are 
considered by the steering committee via a polling system 
integrated into the BBGRE website. Successful application re- 
sults in advanced user status, which enables the user to access 
clinical phenotype data. These phenotype data can then be
searched in the same way as the other criteria described above.
Advanced users are able to track the status of submitted pro- 
jects as they progress through the approval process. 

Discussion and conclusion 

The current version of the BBGRE database (April 2013) 
contains 4092 cases considered to carry CNVs of potential 
clinical significance, and 4908 imbalances (2429 with inher- 
itance information) derived from a total pool of >10 000 
individuals referred for testing at a single regional genetics 
laboratory at Guy's Hospital, London. The database will be
updated every 6 months as it is expected to grow by 1000 
cases per year. Because all samples are processed in a single
laboratory, analytical methods for CNV detection and
interpretation of imbalances are far more consistent
than in the aggregated data sets available elsewhere,
which can be based on different platforms and interpretation
tools. BBGRE, therefore, provides a high-quality data
set for clinical interrogation and for the basis of research 
studies. Data contained within BBGRE are regularly 
updated as more patients are tested clinically. 

Microdeletions in the NRXN1 gene have been associated 
with a range of neurodevelopmental disorders, including 
autism spectrum disorders, schizophrenia, intellectual dis- 
ability, speech and language delay, epilepsy and hypotonia. 
Using the BBGRE tool, we found exonic deletions in the 
NRXN1 gene, predominantly affecting the alpha isoform, 
in patients with a range of neurodevelopmental disorders 
referred for diagnostic cytogenetic analysis (12). Patients 
have a range of phenotypes including developmental 
delay, learning difficulties, attention-deficit hyperactivity 
disorder, autism, speech delay, social communication diffi- 
culties, epilepsy, behaviour problems and microcephaly. 

Other users of the BBGRE database have also been able 
to identify patients who have contributed to research stu- 
dies. A patient carrying an intragenic deletion of NRXN3 
present in this collection was one of the four index cases 
of such deletions in autism spectrum disorder cases (13). BBGRE
has also facilitated identification of a male sex bias in 46
patients carrying 16p13.11 CNVs (14).

By way of example, we describe the search to identify 
patients with NRXN3 imbalances described above. A
user would browse to the main web page http://bbgre.org
and navigate to the search page using the links on the left 
navigator menu. The gene identifier NRXN3 should then be 
entered into the 'GENE' text box. As the user types the char- 
acters into the text box, suggestions for genes present in the 
database will be offered through a 'predictive text' mechan- 
ism. The user then clicks the 'search' button, and seven pa- 
tients will be returned in the results (data release 2). 
Inspecting the GenomicSize column shows us that six pa- 
tients look to be chromosome 14 trisomy mosaic patients 
due to the size of the imbalance. Clicking on the 'details' 
button for patient BBGREID:113108 opens up a new browser
window that gives more patient details including gender, 
age, age at testing and details of this and any other imbal- 
ances this patient has. Clicking on the 'view' button displays 
this patient's imbalance on the UCSC genome browser along- 
side other patients from BBGRE as well as annotation that 
includes OMIM, DECIPHER and DGV. Having identified the 
patient(s) of interest, the user can then apply for access to 
the phenotypes by registering as an advanced user. 
Application for advanced user status involves clicking on 
the left navigation menu and completing a short web form 
that includes PI contact details, institution and a short project 
summary (<200 words). As well as providing access to the 
phenotypes, the advanced user registration is the route 
through which a user would begin the process of accessing 
the patient for recruitment into further studies. 
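
The 'view' step in the walkthrough above amounts to opening the UCSC Genome Browser at the coordinates of the imbalance. A minimal Python sketch of how such a link might be built is given below. It uses only the browser's standard `db` and `position` URL parameters; the function name and the `flank` option are hypothetical, and the saved UCSC session and custom BBGRE track that the website loads are not reproduced here.

```python
from urllib.parse import urlencode

UCSC_HGTRACKS = "https://genome.ucsc.edu/cgi-bin/hgTracks"
INT_TO_SEX_CHROMOSOME = {23: "X", 24: "Y"}  # inverse of the database encoding


def ucsc_url(chrom: int, start: int, end: int,
             assembly: str = "hg19", flank: int = 0) -> str:
    """Build a UCSC Genome Browser link for one imbalance.

    flank (an illustrative option) widens the displayed window on both
    sides so that neighbouring genes remain visible.
    """
    label = INT_TO_SEX_CHROMOSOME.get(chrom, str(chrom))
    window_start = max(1, start - flank)
    window_end = end + flank
    position = f"chr{label}:{window_start}-{window_end}"
    query = urlencode({"db": assembly, "position": position})
    return f"{UCSC_HGTRACKS}?{query}"
```

The integer chromosome codes stored in the database are converted back to `X` and `Y` because the browser expects names of the form `chrX`.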

A process is underway to label the clinical phenotypes 
within BBGRE according to medical subject headings 
terms (National Library of Medicine). This will shortly be 
incorporated into the web resource, enabling users to per- 
form phenotype searches that exploit the hierarchical 
nature of the medical subject headings definitions. 
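
One way a hierarchical phenotype search could work is sketched below in Python. Medical subject headings terms carry dotted tree numbers in which a child's number extends its parent's, so a query for a broad term can be answered by prefix matching. The planned BBGRE implementation is not described in the text, so this is an assumption; the function names are hypothetical and the tree numbers in the usage example are made-up placeholders, not real MeSH identifiers.

```python
from typing import Dict, Iterable, List


def is_descendant(tree_number: str, ancestor: str) -> bool:
    """True if tree_number equals ancestor or lies below it in the hierarchy."""
    return tree_number == ancestor or tree_number.startswith(ancestor + ".")


def patients_with_phenotype(annotations: Dict[int, Iterable[str]],
                            query: str) -> List[int]:
    """Return the IDs of patients annotated with the query term or any narrower term.

    annotations maps a patient identifier to the tree numbers of that
    patient's phenotype terms.
    """
    return sorted(
        patient
        for patient, tree_numbers in annotations.items()
        if any(is_descendant(number, query) for number in tree_numbers)
    )


# Usage with placeholder tree numbers (not real MeSH identifiers):
example = {1: ["T01.10.5"], 2: ["T02.7"], 3: ["T01", "T02.7.1"]}
broad_hits = patients_with_phenotype(example, "T01")
```

Appending the dot before matching prevents a term such as `T010` from being treated as a child of `T01`.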

In summary, the BBGRE resource provides easy access to 
and user-friendly search of CNV genotype and clinical 
phenotype data from patients referred for genetic testing. 
Initial data mining can be performed on the website and 
the results exported and/or viewed on the UCSC Genome 
Browser for further analysis. 

This resource will prove useful for clinicians and re- 
searchers, and contributes to the understanding of geno- 
type/phenotype correlations and the elucidation of gene 
function. Patients are from a single catchment area, and 
we hope to promote research by facilitating interactions 
between researchers and patients.